Abstract- The volume of library data stored in the research and
statistics centers of modern organizations grows daily. As these
databases grow exponentially over time, it becomes exceptionally
difficult to understand the behavior of the data and to interpret
the relationships that exist between attributes. This growth poses
new organizational challenges: conventional record management
infrastructure can no longer give precise and detailed information
about the behavior of data over time. Selecting tools that can
support multi-dimensional big data visualization is a new source of
confusion and concern. Viewing all related data in a database at
once is a problem that has attracted the interest of data
professionals with machine learning skills. It remains a lingering
issue in the data industry because existing techniques cannot
remove or filter noise from relevant data, or fill in missing
values, in order to obtain the required information. The aim is to
develop a stacked generalization model that combines the
functionality of random forest (RF) and decision tree (DT) for
library database visualization. In this paper, the random forest
and decision tree techniques were employed to visualize large
amounts of school library data. The proposed system was implemented
with a few lines of Python code to create visualizations that help
users understand and interpret, at a glance, the behavior of data
and its relationships. The model was trained and tested with a
cross-validation test to learn and extract hidden patterns in the
data. It combined the functionalities of both models to form a
stacked generalization model that performed better than the
individual techniques. The stacked model produced 95% accuracy,
followed by the RF, which produced a 95% accuracy rate and an RMSE
of 0.223600, in comparison with the DT, which recorded an 80.00%
success rate and an RMSE of 0.15990.


Keywords- Data Visualization, Decision Tree, Random Forest, 
Stack. 


I. INTRODUCTION 


Data visualization is a method of representing data that helps
users understand and interpret its structure [1],[2]. It transforms
numerical and categorical data into a graphical or visual form that
makes it easier for researchers to spot outliers and hidden data
trends [2],[4]. Patterns in a database may go unnoticed when the
data is stored in tabular form, but they become easier to see when
visualized using charts and graphs [5]. Modern visualization tools
help users quickly understand and interpret data over time and make
the necessary adjustments to different variables for decision
making [6]. Adediran and Ajibade [7] adopted several data mining
techniques suitable for effectively and accurately visualizing data
stored in multi-dimensional form using line, bar, pie, bubble, and
donut charts [8]. Existing methods of data visualization face
numerous challenges with complex and dynamic data structures and
patterns in high-dimensional space, particularly when interpreting
the behavior of noisy, training, and testing data. They cannot
remove or filter noise from data or fill in missing values in order
to obtain the required information. In the big data age, database
size can grow enormously over time, which may affect the
functionality of existing techniques. Data visualization helps
communicate the details of data at a glance in graphical form.


This work aims to develop a stacked generalization model that
combines the functionalities of random forest (RF) and decision
tree (DT) to visualize library data effectively. The RF and DT
techniques are employed to visualize a large volume of school
library data for the purpose of decision making. The proposed
system is implemented in the Python programming language; it learns
from noisy data and helps visualize detailed insight into the
behavior of training and testing data and their relationships. The
model is developed, trained, and tested to learn and extract hidden
data patterns using a cross-validation test. The outputs of the RF
and DT models at the base level are used as the new training
dataset for the stacked model. The stack combines the
functionalities of both models to form a generalized stacking
technique that can perform better than the individual models, with
the help of a cross-validation test. In the stacking stage, the RF
is placed at the top of the stack, followed by the DT. This work
will be useful to consultants, research libraries, institutions,
and scholars. It offers users speed and accuracy, with the ability
to act on visual findings for better decision-making.


This paper is organized as follows: Section 1 contains the
introduction; Section 2 presents a brief review of previous
approaches relating to the study area and the gap that the proposed
model explores; Section 3 introduces the materials and methods
employed in developing the model; Section 4 presents the results
and a detailed discussion of them; Section 5 concludes the paper.


II. RELATED WORKS 


Nazeer et al. [9] compared different data visualization techniques
on a sizable dataset obtained from a self-study questionnaire. In
the analysis, 90% of the respondents favored adopting data
visualization techniques and recommended the use of modern data
visualization tools like histograms, pie charts, and bar charts.
Moore [10] reviewed the history of data management systems and the
problems that lead to employing big data visualization techniques,
which require advanced machine learning visualization tools to
provide better insight into data. The platform created was
effective in communicating to the user the relevant information
necessary for making better decisions. Narayanan and Shanker [11]
recommended selected data visualization techniques to research
scholars in order to give them a better foundation in business
intelligence. The visualization techniques include network
diagrams, box plots, correlation matrices, donut, pie, and bar
charts, and line and bubble plots. Plank and Helfert [12] discussed
the adoption of interactive big data visualization techniques to
provide managers with better policies for making decisions. The
research findings enhanced user understanding and knowledge and
provided valuable insights about data in the organization. Zhang [13]
discussed the hierarchical view of different authoring systems
using the Kyrix platform with a decision tree for data
visualization. It reduced the barriers to entry with five (5)
different components, namely layers, canvas view, transformers, and
jumps for zooming. It provided a more interactive user platform
with the possibility of adding new and edited icons (buttons). A
graphical user interface was developed to help add layers at the
click of a button, which helps in keeping track of the application
scripts. The Kyrix platform could not be used to visualize data
behavior, training and testing data, or the relationships that
exist between database attributes. Ogier and Stamper [14] proposed
the use of modern data visualization tools in offering services to
research libraries and research scholars. The design strategy
follows a top-down approach that cuts across concepts, refining,
ethics, and layouts. The concepts stage deals with the audience and
intent, while refining is used to tie pieces of data to the whole.
The design layout focuses on data visualization techniques,
framework, and communication, but there was no practical
implementation. Chawla et al. [15] classified big data
visualization techniques into three (3) groups, namely volume,
variety, and dynamics of data. The decision tree was employed to
visualize data in a hierarchical form using the divide-and-conquer
technique, with a decision rule to insert sub-nodes under root and
parent nodes. The hierarchical model suffers from the limitations
of distorted positive and zero-pixel values. Butavicius and Lee
[16] carried out
empirical research using four data visualization techniques to
showcase the similarities of 50 unstructured short text messages
stored in one-dimensional space. The greedy nearest neighbor with
the ISOMAP technique was employed as a base representation to rank
the one-dimensional list using two-dimensional (2-D) visualization
and multi-dimensional scaling (MDS). The MDS display was better
than the ISOMAP technique, and the 2-dimensional display was better
than the 1-dimensional one, but significant variations were
recorded across different dimensionalities of data storage. Gorodov
and Gubarev [17] researched different big data visualization
techniques in relation to noise, large perception, and
misclassified patterns. The visualization techniques revealed the
dynamics of changes that occurred in stored data over time in terms
of volume, format, and dimension using the Tree-Map method.
However, the model suffers from varying scales and the difficulty
of selecting suitable methods, among other issues, for big data
analysis and visualization. Tay et al. [18] proposed the use of
data visualization techniques in organizational research for the
analysis of high-volume big data, and showed how responses are
graphically represented using line plots and charts. They
identified two open issues: data integration, which helps combine
different data modes in revealing details about a phenomenon of
interest, and interactivity, which helps uncover and identify
hidden and new data patterns.


III. MATERIALS AND METHODOLOGY


This work focuses on the RF and DT methods of visualization, with
the stacking concept used to combine the functionality of both
methods into a technique that performs better than the individual
models. A cross-validation test is employed to help the model learn
from noisy data and generalize well on the testing dataset, which
helps overcome the problem of model over-fitting. The proposed
visualizations, which include a heat map, functional graph, ROC,
AUC, RF, DT, error bar plots, and bar charts, are produced with
machine learning (ML) visualization libraries in Python. The bar
charts showcase the training and testing datasets, and the error
bar plots show the sensitivity rate of each ML model against
different data samples. The heat map serves as a diagnostic tool to
visualize, in two dimensions, the correlation between attribute
pairs and missing values. The functional graph provides clear
insight into the training, testing, and validation sets over noisy
and missing data values. The area under the curve (AUC) graph
visualizes the variation of model performance, or learning rate,
under training and cross-validation tests in terms of accuracy over
sample data size. The decision tree organizes data in a tree-like
structure with nodes and sub-nodes (leaf nodes) using the decision
rule. The tree size after construction is proportional to the
number of data values or points. The receiver operating
characteristic (ROC) curve visualizes the performance of the
algorithm and its trade-off between the hit rate and the
false-positive rate.


A. Dataset 
The experimental data was sourced from a public library dataset
made available on data.world (Source:
https://data.world/datasets/public-library) containing the
attributes Book_ID, Title, ISBN, ISBN13, Average ratings, Number of
pages, Rating values, Language used, Publisher name, Publication
date, Authors name, etc., as shown in Table I. The dataset was
divided into an 80% ($\frac{80}{100} \times 45640 = 36512$)
training set and a 20% ($\frac{20}{100} \times 45640 = 9128$)
testing set with data attributes.
TABLE I. LIBRARY BOOK INFORMATION SYSTEM DATASET

Index | Title | ... | Publisher | Ratings
0 | Harry Potter and the Half-Blood Prince (Harry Potter #6) | ... | Scholastic Inc. | 2095690
1 | Harry Potter and the Order of the Phoenix (Harry Potter #5) | ... | Scholastic Inc. | 2153167
2 | Harry Potter and the Chamber of Secrets (Harry Potter #2) | ... | Scholastic | 6333
... | ... | ... | ... | ...
45637 | The Picture of Dorian Gray | ... | W. W. Norton & Company | 663
45638 | The Picture of Dorian Gray | ... | Oxford University Press USA | 1089
45639 | An Arab-Syrian Gentleman and Warrior in the Period of the Crusades: Memoirs of Usamah Ibn-Munqidh | ... | Columbia University Press | 68
45640 | Technical Manual and Dictionary of Classical Ballet | ... | Dover Publications | 393
45641 | The Ballet Companion: A Dancer's Guide to the Technique Traditions and Joys of Ballet | ... | Touchstone | 524
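
The 80/20 division stated in the Dataset subsection can be checked with a short sketch. The record count of 45640 is taken from the text; the helper name `split_indices` and the fixed seed are illustrative assumptions and not part of the original implementation.

```python
import random

def split_indices(n_records, train_fraction=0.8, seed=0):
    """Shuffle record indices and split them into training and testing lists."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)   # fixed seed keeps the split repeatable
    n_train = round(n_records * train_fraction)
    return indices[:n_train], indices[n_train:]

# 45640 is the record count quoted in the text.
train_idx, test_idx = split_indices(45640)
print(len(train_idx), len(test_idx))  # 36512 9128
```

Shuffling before the cut is a design choice made here so that both subsets sample the whole index range; the source does not say whether the records were shuffled.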





B. Data Transformation 


Data transformation is the process of converting raw data into a
format and structure suitable for model building, using extraction,
transformation, and loading techniques [19], [20]. The
transformation stage helps normalize numeric features to produce a
better model and allows regression techniques like the random
forest to capture nonlinear features in its forest space [21].
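
As an illustration of normalizing a numeric feature, the sketch below applies min-max scaling. The text does not state which normalization was used, so min-max scaling, the function name `min_max_scale`, and the sample page counts are all assumptions made for the example.

```python
def min_max_scale(values):
    """Rescale a list of numbers linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:                       # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

num_pages = [652, 870, 352, 435]       # hypothetical page counts
scaled_pages = min_max_scale(num_pages)
print(scaled_pages)
```

After scaling, the smallest value maps to 0 and the largest to 1, so features measured on very different scales (page counts versus average ratings, for example) become comparable.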


C. Feature Extraction 


Feature extraction is mandatory for any application that involves
identifying relevant features in a database. It is the process of
extracting features to assist the task of classifying data patterns
[22]. This phase is important because it can influence
classification and regression tasks when ML tools are adopted.


D. Classification 


Classification is one of the key components of the decision-making
process; it categorizes data based on observed features using
particular criteria [23]. Row sampling and class-feature sampling
techniques are used for each decision tree in the forest to reduce
bias, noise, and high variance [24]. With these techniques, a
change in the input dataset causes only low variance in the trees,
and the binary classification model reaches its prediction by
majority vote [24].


E. The Decision Tree 


The construction of a DT follows the concept of divide and conquer:
the source dataset is split into subdivisions called sub-nodes
based on a test value [26],[27]. This process is repeated
recursively on each subset, and the recursion terminates when
splitting no longer adds value to the predictions. In a DT, data
arrives in the form of records [28],[29] written as:


$(\mathbf{x}, Y) = (x_1, x_2, x_3, \ldots, x_n, Y)$    (1)


where Y is the dependent variable, referred to as the target
variable, and $\mathbf{x}$ is the vector of independent input
variables $x_1, x_2, x_3, \ldots, x_n$. The DT uses a decision rule
characterized by ordered pairs [30],[31].


F. RF Regression Model 


The scaled training and testing datasets were fed to the RF model,
which was fitted using the regression and classification classes of
the sklearn ensemble library. The model was trained and tested with
a specific number of n_estimators (decision trees) as a fine-tuned
parameter, with random_state set to 0, the maximum number of leaf
nodes set to 100, and n_jobs = -1, to obtain an optimal solution
whose adjustable parameters make database relationships, noisy data
content, and over-fitting features clearly visible. The model was
trained with the Python call model.fit(x_train, y_train). The
output of the RF model was then fed as input to the stacking model
using a cross-validation test to reduce and filter the noisy data
content from the database system. The ensemble library was employed
to classify objects into similar groups by constructing multiple
decision trees in the forest space, with n_estimators in the forest
class left as the parameter to fine-tune.
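
A minimal sketch of the RF configuration described above is given below, assuming scikit-learn and NumPy are installed. The values random_state=0, max_leaf_nodes=100, and n_jobs=-1 come from the text; n_estimators=100 is an assumed value because the text does not state it, and the synthetic data stands in for the scaled library features.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the scaled library features (assumption).
rng = np.random.default_rng(0)
X = rng.uniform(-4, 4, size=(500, 3))
y = np.sin(X[:, 0]) + 0.2 * rng.normal(size=500)

x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

model = RandomForestRegressor(
    n_estimators=100,      # assumed; the number of trees is not given in the text
    max_leaf_nodes=100,    # from the text
    random_state=0,        # from the text
    n_jobs=-1,             # from the text: use all available cores
)
model.fit(x_train, y_train)

y_pred = model.predict(x_test)
rmse = float(np.sqrt(np.mean((y_test - y_pred) ** 2)))
print(f"test RMSE: {rmse:.3f}")
```

The model is fitted on the training pair (x_train, y_train) and evaluated on the held-out test pair, which is the usual arrangement for estimating generalization error.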


G. The Stacking Model 


Stacking, also called stacked generalization, is a technique that
combines the functionality of two or more techniques to form a
model that performs better than the individual learning models. The
two learners are each used to build an intermediate prediction, one
for each learning model, and the final model learns from these
intermediate predictions with the same target variable. The final
model is said to be stacked, with the RF on top of the DT. Stacking
improves overall performance and often ends up performing better
than the individual intermediate models at the base level, as shown
in Figure 1.


[Figure 1: diagram of the stack, showing the base models and the stacked model level]


Figure 1. The Stacking Model


The stacked model architecture combines the RF and DT techniques,
which are referred to as the base models at level-0. The
functionalities of both are used at the base level to build a
meta-learning model called the stack. The random forest is placed
at the top of the stack, followed by the decision tree, as shown in
the following Python code segment.

    from mlxtend.regressor import StackingCVRegressor

    # rf, dt: the RF and DT base models (variable names assumed)
    stack = StackingCVRegressor(regressors=(rf, dt),
                                meta_regressor=rf,
                                cv=4,
                                use_features_in_secondary=True,
                                shuffle=False,
                                random_state=42)
    stack.fit(x_train, y_train)





H. New Training Data 


The training dataset was divided into k folds for cross-validation.
The base models were fitted on k-1 parts of the training set, and
predictions were made for the kth part to compute their
performance. This process is repeated for every fold, and the
resulting predictions on the training set are used as features for
training the stacked model, which is then used to visualize the
testing dataset.
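
The construction of the new training data can be sketched as follows, assuming scikit-learn and NumPy are installed. The sketch uses scikit-learn's cross_val_predict rather than the mlxtend class named in the paper, the data are synthetic, and the number of trees is an assumed value; only the four folds without shuffling follow the settings quoted in the text.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(42)
X = rng.uniform(-4, 4, size=(200, 2))           # synthetic features (assumption)
y = X[:, 0] ** 2 + 0.2 * rng.normal(size=200)   # synthetic target (assumption)

base_models = [
    RandomForestRegressor(n_estimators=50, random_state=0),  # 50 trees assumed
    DecisionTreeRegressor(random_state=0),
]
folds = KFold(n_splits=4, shuffle=False)

# Each column holds one base model's predictions for rows that were
# held out of its training folds (out-of-fold predictions).
meta_features = np.column_stack(
    [cross_val_predict(m, X, y, cv=folds) for m in base_models])

# The level-1 model is trained on the out-of-fold predictions.
meta_model = RandomForestRegressor(n_estimators=50, random_state=0)
meta_model.fit(meta_features, y)
print(meta_features.shape)  # (200, 2)
```

Because every prediction in meta_features comes from a model that did not see that row during fitting, the level-1 model is not trained on over-optimistic base-model outputs.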


I. Meta Model 


The meta model at level-1 is a meta-layer that accepts the output
of the base models (first level) as its new training data. The
stacking regressor class was invoked from the mlxtend library,
which contains the stacking cross-validation regressor in Python,
with the RF stacked on top of the DT model. The meta_regressor was
set to point at the RF, with cv=4, use_features_in_secondary=True,
shuffle=False, and random_state=42, and the stack was built with
the code segment stack.fit(x_train, y_train).


Algorithm 1: The RF Algorithm

Step  Processes involved

1     Start
2     Assume the number of cases in the training set is C, and
      randomly select cases with replacement
3     If there are M input features, specify a number m < M; take
      the best split on these m features at each node, and keep the
      value of m constant
4     Grow each decision tree to the largest possible size without
      pruning
5     Use majority vote for classification and the mean for
      regression tasks
6     Stop
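
Steps 2 and 5 of Algorithm 1 can be sketched in plain Python. The function names `bootstrap_sample` and `aggregate`, the seed, and the example inputs are hypothetical; the tree-growing steps 3 and 4 are omitted here.

```python
import random
from collections import Counter
from statistics import mean

def bootstrap_sample(cases, rng):
    """Step 2: draw len(cases) cases at random with replacement."""
    return [rng.choice(cases) for _ in cases]

def aggregate(predictions, task):
    """Step 5: combine the predictions of the individual trees."""
    if task == "classification":
        return Counter(predictions).most_common(1)[0][0]  # majority vote
    return mean(predictions)                              # regression: mean

rng = random.Random(0)
cases = list(range(10))
sample = bootstrap_sample(cases, rng)
print(len(sample))                                        # 10
print(aggregate(["yes", "no", "yes"], "classification"))  # yes
print(aggregate([2.0, 4.0, 6.0], "regression"))           # 4.0
```

Sampling with replacement means some cases appear several times in a tree's training sample and others not at all, which is what makes the trees in the forest differ from one another.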





Algorithm 2: Decision Tree (DT)

Step  Processes involved

1     Start
2     Compute class frequency value (CFT) and return a leaf_Node
3     Create a decision tree of N nodes
4     Loop through each attribute A to ComputeGainValue(A)
5     N(test) = Best_attribute_Gain
6     If N(test) is continuous
7     For each CFT' in the splitting of CFT
8     (a). If CFT' is empty then
               Child_Node is a leaf node
           Else:
               Child_Node = Form_Decision_Tree(CFT')
9     Compute_Error_Node
      return Node
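
Steps 4 and 5 of Algorithm 2 (computing a gain value for each attribute and keeping the best one) can be illustrated as below. The algorithm does not say which gain measure was used, so entropy-based information gain on categorical attributes is an assumption, and the function names and sample rows are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    total = len(labels)
    return -sum((n / total) * log2(n / total)
                for n in Counter(labels).values())

def gain(rows, labels, attribute):
    """Information gain from splitting the rows on a categorical attribute."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g)
                    for g in groups.values())
    return entropy(labels) - remainder

def best_attribute(rows, labels, attributes):
    """Return the attribute with the highest gain."""
    return max(attributes, key=lambda a: gain(rows, labels, a))

rows = [  # hypothetical book records
    {"language": "eng", "long": True},
    {"language": "eng", "long": False},
    {"language": "fre", "long": True},
    {"language": "fre", "long": False},
]
labels = ["high", "high", "low", "low"]
print(best_attribute(rows, labels, ["language", "long"]))  # language
```

In this toy example the language attribute separates the two classes perfectly, so it yields the full gain of one bit, while the long attribute yields none.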





This is the final step, used to measure the model's performance in
terms of accuracy, root mean square error (RMSE), standard
deviation, mean score, sensitivity values, ROC, and AUC, so that
the proposed system can be adequately compared.

$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$    (3)

where TN represents true negative, FP false positive, TP true
positive, and FN false negative cases.


Sensitivity is the ratio of correctly classified positive cases to
all actual positive cases, computed using a function in Python as
given in equation 4:

$\text{Sensitivity} = \frac{TP}{TP + FN}$    (4)
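
The accuracy and sensitivity definitions above translate directly into code. The confusion counts below are hypothetical; they were chosen only so that the results are easy to verify by hand.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all cases that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def sensitivity(tp, fn):
    """Fraction of actual positive cases that were classified as positive."""
    return tp / (tp + fn)

# Hypothetical confusion counts for 20 test cases.
tp, tn, fp, fn = 10, 4, 4, 2
acc = accuracy(tp, tn, fp, fn)
sens = sensitivity(tp, fn)
print(round(acc, 4), round(sens, 4))  # 0.7 0.8333
```

With these counts, 14 of 20 cases are correct (accuracy 0.7) and 10 of the 12 actual positives are found (sensitivity 0.8333).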


J. Standard Deviation 


Standard deviation is a measure of dispersion for a set of
numerical values. It is the square root of the average squared
spread of the data values x from their mean, and so shows how far
the data lie from the average or mean point.









$\text{Standard deviation (S.D)} = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}(x_i - \bar{x})^2}$    (5)


where N is the total number of observations in the dataset, $x_i$
denotes the individual observations in the sample dataset, and
$\bar{x}$ is the sample mean [32]. The standard deviation was
computed using the sqrt function from the math module of the Python
standard library, together with the stdev function of the
statistics module, which takes a sample of data and returns its
standard deviation.
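
The computation can be sketched with the two standard-library tools the text mentions, assuming the sample form of the formula (division by N - 1), which is what Python's statistics.stdev implements. The function name `sample_std` and the rating values are hypothetical.

```python
import math
import statistics

def sample_std(data):
    """Sample standard deviation: square root of the mean squared
    deviation from the sample mean, using N - 1 in the denominator."""
    mean = sum(data) / len(data)
    squared_deviations = sum((x - mean) ** 2 for x in data)
    return math.sqrt(squared_deviations / (len(data) - 1))

ratings = [3.5, 4.0, 4.5, 5.0]       # hypothetical average ratings
manual = sample_std(ratings)
library = statistics.stdev(ratings)  # standard-library equivalent
print(round(manual, 4), round(library, 4))  # 0.6455 0.6455
```

If the whole population rather than a sample is available, statistics.pstdev (division by N) is the corresponding standard-library function.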


IV. RESULTS AND DISCUSSIONS


The results of the classification and regression analysis are
presented and discussed in detail using suitable ML visualization
tools. The design and implementation were carried out with varying
fine-tuned hyper-parameter values to provide better insight into
the library data. A dataset was generated with a recursive
distribution of noisy data using the piecewise function f(x) given
as:


f(x) = [piecewise definition not legible in the source]    (2)


where f(x) is the piecewise function and x represents the training
and testing samples. A function was created to distribute noise
within the n_samples across the interval -4 to 4 in order to
visualize the behavior of the training and testing data. The
predictions and classification accuracy of both models are
visualized, presented, and discussed below using the
cross-validation curve, neighborhood graph, standard deviation, and
mean scores.


[Figure 2: bar chart titled "Total No. of Datasets"; x-axis: Dataset (Testing Dataset (20%), Training Dataset (80%)); y-axis: Total No. of Items]


Figure 2. The Total Number of Datasets
Figure 2 depicts the total number of training and testing records
for the proposed model. The dataset was divided into 80% training
and 20% testing sets; the model was trained using the training
dataset, and its performance was evaluated with the test dataset.


[Figure 3: correlation heat map over bookID, average_rating, isbn13, num_pages, text_reviews_count, Copies, and ratings, with a color scale up to 1.0]


Figure 3. The Correlation Matrix


Figure 3 is the correlation matrix used to measure the
relationships between database attributes (variables). The matrix
depicts the linear correlation between all possible pairs of
Book_ID, Average ratings, ISBN number, text reviews count, and
rating values. The main diagonal holds the correlation of each
attribute with itself, and the entries above and below it show both
positive and negative correlations between attribute pairs. A
positive correlation value indicates that the two variables move in
the same direction, while a negative correlation shows that they
move in opposite directions.
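
The entries of such a matrix can be computed with the Pearson correlation coefficient, as in the sketch below. The paper does not state which coefficient was used, so Pearson correlation is an assumption here, and the column values are hypothetical and chosen to give one perfectly positive and one perfectly negative pair.

```python
import math

def pearson(xs, ys):
    """Pearson linear correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Correlation of every pair of named columns, as a nested dict."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names}
            for a in names}

columns = {  # hypothetical attribute values
    "num_pages": [100, 200, 300, 400],
    "ratings": [10, 20, 30, 40],
    "copies": [8, 6, 4, 2],
}
matrix = correlation_matrix(columns)
print(matrix["num_pages"]["ratings"])  # 1.0
print(matrix["num_pages"]["copies"])   # -1.0
```

Here ratings rises with num_pages (coefficient 1.0) while copies falls as num_pages rises (coefficient -1.0); each attribute correlates perfectly with itself, which is why the main diagonal of a correlation matrix is always 1.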


[Figure 4: decision tree diagram. Root node: x[12] <= 0.5, squared_error = 31242363846.828, samples = 2500, value = 35608.514. Child node: squared_error = 1314015878.779, samples = 2409, value = 12257.79. The second child node is not legible in the source.]


Figure 4. Decision Tree Structure for Data Visualization.








The above decision tree is generated using If...Then...Else rules
of the form: If (Value <= Node): Attach_to_Left Else:
Attach_to_Right Endif.


Figure 5. Tree Structure of Library Book Details 


Figure 5 is a single DT of greater depth, grown from Figure 4 using
the decision rule. Splitting is done based on the node value, which
clearly defines the position at which sub-nodes are inserted below
the root. Figure 5 is the tree structure generated from the system
dataset. The model randomly selects patterns from the original set
to grow the tree, and variables that represent a random subset at
each step. This generated a tree of height five (5).


























Figure 6. Forest Structure of Library Book Details 


Figure 6 is the random forest generated from the proposed training
dataset. It combines the simplicity of different decision trees
through voting, which resulted in high accuracy on the library
data. The forest was built by choosing samples randomly from the
original set with replacement to grow the trees, and by choosing
variables that represent a random subset at each step. This
resulted in a wide variety of trees in the forest.


[Figure 7: plot titled "Decision tree regressor, MSE = 0.01"; x-axis: Interval; y-axis: Function(x); legend: Testing Data, Training Data, DT Pred]


Figure 7. Training, Testing and Validation Set of Data Behavior


Figure 7 shows the behavior of the DT model on the training,
testing, and validation sets. The tree learns the fine details of
the training data too closely, including the noise (noise = 0.2)
injected through a smoothing piecewise function, and over-fitting
occurs, as shown by the predictions obtained on the testing set.
The DT in Figure 7 could not generalize well on the testing
dataset, but it fitted the training set almost perfectly, with a
mean square error of 0.01 over the interval -4 to +4.


www.ijeacs.com 6 


Stanley Ziweritin 


[Figure 8: plot titled "Random forest regressor, MSE = 0.00"; x-axis: Interval (-4 to 4); legend: Testing Data, Training Data, and a prediction series labeled "DTpred" in the source]


Figure 8. The Functional Graph of RF


Figure 8 is the RF functional graph showing the behavior of the
testing, training, and validation sets; no over-fitting was
observed even in the presence of noisy data. The training data
pattern fluctuates at certain points and matches the target at
others.


[Figure 9: plot titled "Validation Curve With Random Forest"; x-axis: Number Of Trees (0 to 250); y-axis: Accuracy Score; legend: Training score, Cross-validation score]


Figure 9. The Learning Curve of RF


Figure 9 visualizes the performance of the RF model over a range of
fine-tuned hyper-parameter values (number of trees). The
cross-validation curve starts at an accuracy score of 0.9 and rises
steadily as the number of trees increases, remaining lower than the
training score at every point.


[Figure 10: plot titled "Learning Curve of Decision Tree"; x-axis: Training Set Size; y-axis: Accuracy Score; legend: Training score, Cross-validation score]


Figure 10. The Learning Curve of DT


Figure 10 depicts the learning curve of the DT on the validation
and training sets. The cross-validation score increased at the
initial stage, decreased, and then increased again at the final
stage in a fluctuating pattern, while the training score remained
constant at 1.0. The training score is therefore higher than the
validation score across training sets of increasing size.


[Figure 11: plot titled "The Sensitivity errorbar plot"; x-axis: Data Sample (0 to 50000); y-axis: Standard Deviation (E-03)]


Figure 11. The Sensitivity Graph of Data over Sample Size

Figure 11 depicts the sensitivity analysis of the DT and RF
techniques, used as a metric to visualize the variation in, and
relationship between, model performance and data sample size.


[Figure 12: plot titled "The Sensitivity errorbar plot"; x-axis: Data Sample; y-axis: Mean Score (E-03); legend: Random Forest, Decision Tree]


Figure 12. Mean Sensitivity Graph of Data over Sample Size


Figure 12 is the sensitivity plot that represents the performance
of the DT and RF models graphically by plotting the mean score
values against different data sample sizes.


[Figure 13: plot titled "Total DT Impurity Vs Effect"; x-axis: effective alpha (0.000 to 0.035); y-axis: total impurity]


Figure 13. Total DT Impurity Vs Effect

Figure 13 is the DT graph of total impurity against effective alpha
for the training dataset, obtained with minimal cost-complexity
pruning, which is used to reduce over-fitting. The effective alpha
values recorded low variance, and the links with the smallest
effective alpha are pruned first. The total impurity increases in a
stepwise manner as the effective alpha increases.
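
A pruning path of this kind can be obtained as sketched below, assuming scikit-learn and NumPy are installed. The synthetic data and the choice of the middle alpha for refitting are assumptions made for illustration; they are not the settings used in the paper.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-4, 4, size=(200, 1))             # synthetic data (assumption)
y = np.sin(X[:, 0]) + 0.2 * rng.normal(size=200)  # noisy target (assumption)

# Effective alphas and the total leaf impurity at each pruning step.
tree = DecisionTreeRegressor(random_state=0)
path = tree.cost_complexity_pruning_path(X, y)
ccp_alphas, impurities = path.ccp_alphas, path.impurities

# Refit with one of the alphas; a larger alpha prunes more of the tree.
alpha = float(max(0.0, ccp_alphas[len(ccp_alphas) // 2]))
full = DecisionTreeRegressor(random_state=0).fit(X, y)
pruned = DecisionTreeRegressor(random_state=0, ccp_alpha=alpha).fit(X, y)
print(full.get_n_leaves(), pruned.get_n_leaves())
```

Plotting impurities against ccp_alphas gives a step-shaped curve of the kind described for Figure 13, and comparing the leaf counts shows how much of the tree a given alpha removes.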
[Figure 14: plot titled "The ROC curve of R_forest and D_tree"; x-axis: False Positive Rate (0.0 to 1.0); legend: D_Tree, Forest]


Figure 14. The ROC Curve of RF and DT

Figure 14 is the receiver operating characteristic (ROC) graph of
the RF and DT, showing the trade-off between sensitivity (true
positive rate) and specificity (1 - FPR). The RF ROC curve is wider
and closer to the top-left corner of the graph, which indicates
that it performed better than the DT. The diagonal (True Positive
Rate = False Positive Rate) marks chance-level performance and
serves as the reference line against which the high accuracy of the
proposed system is read.
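
The points of a ROC curve and the area under it can be computed as in the sketch below. The labels and scores are hypothetical, the function names are illustrative, and this simple version assumes that no two samples share the same score.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs obtained by sweeping the decision threshold."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    tp = fp = 0
    # Visit the samples from the highest score to the lowest.
    for _, label in sorted(zip(scores, y_true), reverse=True):
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

y_true = [1, 1, 0, 1, 0, 0]               # hypothetical labels
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]   # hypothetical model scores
curve = roc_points(y_true, scores)
area = auc(curve)
print(round(area, 4))  # 0.8889
```

A curve that hugs the top-left corner has an area close to 1, whereas a curve along the diagonal has an area of 0.5, which corresponds to random guessing.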








TABLE II. COMPARING DT, RF AND STACKED TECHNIQUES

MODEL | ACCURACY (%) | RMSE | Sensitivity
DT | 70.0 | 0.5477 | 0.8333
RF | 85.0 | 0.3873 | 1.0000





Table II shows the performance of the RF, DT, and stacked models in
terms of accuracy and RMSE. The stacked model recorded the highest
accuracy, 95.0%, with an RMSE of 0.0987; the RF technique produced
85.0% accuracy with an error rate of 0.3873, compared with 70.0%
accuracy and an error rate of 0.5477 for the DT. This shows that
the stacked model performed best in terms of accuracy, followed by
the RF, with the DT producing the lowest performance.
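
For predictions on 0/1 labels, each error contributes exactly 1 to the squared-error sum, so the RMSE equals the square root of the error rate, that is, $\sqrt{1 - \text{accuracy}}$. The DT and RF rows of Table II are numerically consistent with this relation ($\sqrt{0.30} \approx 0.5477$ and $\sqrt{0.15} \approx 0.3873$), although whether the RMSE values were obtained this way is an assumption. The sketch below uses hypothetical labels with 6 errors in 20 cases.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))

# 20 hypothetical binary labels with 6 wrong predictions (70% accuracy).
y_true = [1] * 10 + [0] * 10
y_pred = [1] * 7 + [0] * 3 + [0] * 7 + [1] * 3
acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
error = rmse(y_true, y_pred)
print(acc, round(error, 4))  # 0.7 0.5477
```

The same function applies unchanged to real-valued regression outputs, where the relation to accuracy no longer holds.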


V. CONCLUSION 


The proposed ML visualization techniques help users understand
database information much more easily when it is presented in
visual form than when the existing system tools are used. The
selected ML models make it appealing to interpret data and its
composition for better decision-making. Several visualization
techniques have been adopted, but some of them may lead to wrong or
poor data visualization. It is therefore important to choose the
appropriate visualization technique in order to understand and
interpret the data better for future use. The stacked model
performed best, followed by the RF, and the visualization results
of the RF technique outperformed the DT model in terms of accuracy,
error rate, behavior on training data containing noise, and the
model's learning rate on the testing dataset.